AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline
An open-source Mandarin speech corpus called AISHELL-1 is released. It is by
far the largest corpus suitable for conducting speech recognition
research and building speech recognition systems for Mandarin. The recording
procedure, including audio capturing devices and environments, is presented in
detail. The preparation of the related resources, including transcriptions and
lexicon, is described. The corpus is released with a Kaldi recipe. Experimental
results imply that the quality of the audio recordings and transcriptions is
promising.
Comment: Oriental COCOSDA 201
Modeling Hybrid Novel Traits: A case study in complex petal pigment patterning in hybrid Mimulus
Hybridization between species, by introducing dramatic trait variation into the population and creating viable, transgressive offspring with novel phenotypes, can have huge evolutionary implications. Some hybrid traits have been studied in the classical genetics or population genetics context, but most complex traits are determined by multiple factors, e.g. the number of loci involved, the rewiring of genetic circuitry, and changes in gene expression patterns. Using hybrid monkeyflower petal pigment patterning as an example, we present a case study that investigates complex hybrid traits in a systematic manner, combining empirical data analysis with quantitative mathematical modeling of the petal spot patterning trait in the F2 population. We identified candidate loci for potential Turing-like dynamics that regulate the trait and simulated a 2-D F2 trait space under hybrid genetics assumptions that determine the pattern variations. Our study provides a fresh angle for studying complex hybrid traits, and the workflow can be applied to other similar systems.
Pan-cancer evaluation of gene expression and somatic alteration data for cancer prognosis prediction
Background: Over the past decades, approaches for diagnosing and treating cancer have seen significant improvement. However, the variability of patient and tumor characteristics has limited progress on methods for prognosis prediction. The development of high-throughput omics technologies now provides multiple approaches for characterizing tumors. Although a large number of published studies have focused on integration of multi-omics data and use of pathway-level models for cancer prognosis prediction, a knowledge gap remains regarding the prognostic landscape across multi-omics data for multiple cancer types using both gene-level and pathway-level predictors.
Methods: In this study, we systematically evaluated three commonly available types of omics data (gene expression, copy number variation and somatic point mutation) covering both DNA-level and RNA-level features. We evaluated the landscape of predictive performance of these three omics modalities for 33 cancer types in the TCGA using a Lasso or Group Lasso-penalized Cox model and either gene-level or pathway-level predictors.
Results: We constructed the prognostic landscape using three types of omics data for 33 cancer types on both the gene and pathway levels. Based on this landscape, we found that predictive performance is cancer type dependent, and we highlighted the cancer types and omics modalities that support the most accurate prognostic models. In general, models estimated on gene expression data provide the best predictive performance on either the gene or pathway level, and adding copy number variation or somatic point mutation data to gene expression data does not improve predictive performance, with some exceptional cohorts including low grade glioma and thyroid cancer. In general, pathway-level models have better interpretative performance, higher stability and smaller model size across multiple cancer types and omics data types relative to gene-level models.
Conclusions: Based on this landscape and comprehensive comparison, models estimated on gene expression data provide the best predictive performance on either the gene or pathway level. Pathway-level models have better interpretative performance, higher stability and smaller model size relative to gene-level models.
Stress analysis of rigid hanger of railway arch bridge based on vehicle-bridge coupling vibration
To study the stress in two new types of rigid hangers (circular steel and flat-plate rigid hangers) on railway arch bridges, a finite element model of a railway through arch bridge was established. The influences of different hanger types and sizes on the dynamic characteristics of the bridge were compared. Based on the established vehicle-bridge coupling vibration model, the influences of circular steel and flat-plate hanger sizes on the hanger stress amplitude as a train passes over the bridge were discussed. The results show that when the flexible hangers of the arch bridge were replaced by rigid hangers, the symmetrical vertical bending frequency of the bridge increased significantly. As the size of the flat-plate hanger changed, the torsional mode of the bridge became mixed with local vibration of the flat-plate hanger. As the circular steel hanger diameter increases, the maximum stress amplitude of the hanger decreases overall. For the flat-plate hanger, when the long side size b is held fixed, the maximum stress amplitude of the hanger decreases as the short side size d increases. When the short side size d is held fixed, as the long side size b increases, the maximum stress amplitude of the shorter hanger decreases and that of the longer hanger increases. When the flat-plate hanger is too small or too large, the maximum stress amplitude is large.
Prognostic nomogram for bladder cancer with brain metastases: a National Cancer Database analysis.
BACKGROUND: This study aimed to establish and validate a nomogram for predicting brain metastasis in patients with bladder cancer (BCa) and assess various treatment modalities using a primary cohort comprising 234 patients with clinicopathologically-confirmed BCa from 2004 to 2015 in the National Cancer Database.
METHODS: A machine learning method and a Cox model were used for nomogram construction. For BCa patients with brain metastasis, surgery of the primary site, chemotherapy, radiation therapy, palliative care, brain confinement of metastatic sites, and the Charlson/Deyo Score were the predictive features identified for building the nomogram.
RESULTS: For the original 169 patients considered in the model, the areas under the receiver operating characteristic curve (AUC) were 0.823 (95% CI 0.758-0.889, P < 0.001) and 0.854 (95% CI 0.785-0.924, P < 0.001) for 0.5- and 1-year overall survival, respectively. In the validation cohort, the nomogram displayed similar AUCs of 0.838 (95% CI 0.738-0.937, P < 0.001) and 0.809 (95% CI 0.680-0.939, P < 0.001), respectively. The high and low risk groups had median survivals of 1.91 and 5.09 months for the training cohort and 1.68 and 8.05 months for the validation set, respectively (both P < 0.0001).
CONCLUSIONS: Our prognostic nomogram provides a useful tool for predicting overall survival as well as for assessing risk and optimal treatment for BCa patients with brain metastasis.
Dynamic Local Attention with Hierarchical Patching for Irregular Clinical Time Series
Irregular multivariate time series data is prevalent in the clinical and
healthcare domains. It is characterized by time-wise and feature-wise
irregularities, making it challenging for machine learning methods to work
with. To solve this, we introduce a new model architecture composed of two
modules: (1) DLA, a Dynamic Local Attention mechanism that uses learnable
queries and feature-specific local windows when computing the self-attention
operation. This aggregates the raw inputs at irregular time steps within
each window into a harmonized, regular latent-space representation while taking
into account the different features' sampling rates. (2) A hierarchical MLP
mixer that processes the output of DLA through multi-scale patching to leverage
information at various scales for the downstream tasks. Our approach
outperforms state-of-the-art methods on three real-world datasets, including
the latest clinical MIMIC IV dataset.
Comment: Findings of Machine Learning for Health (ML4H) 202
Prioritized Planning for Target-Oriented Manipulation via Hierarchical Stacking Relationship Prediction
In scenarios involving the grasping of multiple targets, learning the
stacking relationships between objects is fundamental for robots to execute
grasps safely and efficiently. However, current methods do not subdivide the
hierarchy of stacking relationship types. In scenes where objects are mostly
stacked in an orderly manner, they cannot make human-like,
highly efficient grasping decisions. This paper proposes a perception-planning
method to distinguish different stacking types between objects and generate
prioritized manipulation order decisions based on given target designations. We
utilize a Hierarchical Stacking Relationship Network (HSRN) to discriminate the
hierarchy of stacking and generate a refined Stacking Relationship Tree (SRT)
for relationship description. Considering that objects with high stacking
stability can be grasped together if necessary, we introduce an elaborate
decision-making planner based on the Partially Observable Markov Decision
Process (POMDP), which leverages observations and generates the least
grasp-consuming decision chain with robustness and is suitable for
simultaneously specifying multiple targets. To verify our work, we use a
dining-table scene and augment the REGRAD dataset with a set of common
tableware models for network training. Experiments show that our method
effectively generates grasping decisions that conform to human requirements
and improves execution efficiency compared with existing methods while
maintaining the success rate.
Comment: 8 pages, 8 figures